Ceph as a scalable alternative to the Hadoop Distributed File System
Author
Abstract
The Hadoop Distributed File System (HDFS) has a single metadata server that sets a hard limit on its maximum size. Ceph, a high-performance distributed file system under development since 2005 and now supported in Linux, bypasses the scaling limits of HDFS. We describe Ceph and its elements and provide instructions for installing a demonstration system that can be used with Hadoop.
Similar resources
Personalized Cloud Storage System: A Combination of LDAP Distributed File System
As cloud computing flourishes, distributed storage systems such as Gluster, Ceph, Lustre, and Hadoop are becoming increasingly diverse. In this paper, we propose a personal cloud storage system integrated with pNFS that can be accessed in parallel for scalable performance. Data backup and failover mechanisms are also designed. We expect that the function of the pro...
Impact of Single Parameter Changes on Ceph Cloud Storage Performance
In a general purpose cloud system efficiencies are yet to be had from supporting diverse applications and their requirements within a storage system used for a private cloud. Supporting such diverse requirements poses a significant challenge in a storage system that supports fine grained configuration on a variety of parameters. This paper uses the Ceph distributed file system, and in particula...
A Scalable RDF Data Processing Framework based on Pig and Hadoop
To handle the growing amount of available RDF data effectively, scalable and flexible RDF data processing frameworks are needed. While emerging technologies for Big Data, such as Hadoop-based systems that take advantage of scalable and fault-tolerant distributed processing, based on Google’s distributed file system and MapReduce parallel model, have become available, there are still m...
Comparing Hadoop and Fat-Btree Based Access Method for Small File I/O Applications
Hadoop has been widely used in various clusters to build scalable and high-performance distributed file systems. However, the Hadoop Distributed File System (HDFS) is designed for managing large files. In small-file applications, metadata requests flood the network and consume most of the memory in the Namenode, sharply hindering its performance. Therefore, many web applications ...
Of Ivory and Smurfs: Loxodontan MapReduce Experiments for Web Search
This paper describes Ivory, an attempt to build a distributed retrieval system around the open-source Hadoop implementation of MapReduce. We focus on three noteworthy aspects of our work: a retrieval architecture built directly on the Hadoop Distributed File System (HDFS), a scalable MapReduce algorithm for inverted indexing, and webpage classification to enhance retrieval effectiveness.
Journal title:
Volume, issue:
Pages: -
Publication date: 2010